Algorithms for Leximin-Optimal Fair Policies in Repeated Games
Abstract
Solutions to non-cooperative multiagent systems often require a joint policy that is as fair as possible to all parties. There are a variety of methods for determining the fairest such joint policy. One approach, min fairness, finds the policy that maximizes the minimum average reward given to any agent. We focus on an extension, leximin fairness, which breaks ties among candidate policies by choosing the one that maximizes the second-to-minimum average reward, then the third-to-minimum average reward, and so on. This method has a number of advantages over others in the literature, but has so far been little used because of the computational cost of employing it to find the fairest policy. In this paper we propose a linear-programming-based algorithm for computing leximin-fair policies in repeated games, which runs in polynomial time given certain reasonable assumptions.
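The difference between min fairness and leximin fairness can be illustrated on a small, finite set of candidate policies. The sketch below is not the linear-programming algorithm proposed in the paper. It only shows the leximin ordering itself, assuming each candidate is summarized by a vector of per-agent average rewards. The names `leximin_key`, `leximin_best`, and the reward values are hypothetical and chosen for illustration.

```python
from typing import Dict, Sequence, Tuple


def leximin_key(rewards: Sequence[float]) -> Tuple[float, ...]:
    """Sort per-agent average rewards in ascending order.

    Comparing these sorted tuples lexicographically compares the
    minimum first, then the second-to-minimum, and so on.
    """
    return tuple(sorted(rewards))


def leximin_best(candidates: Dict[str, Sequence[float]]) -> str:
    """Return the name of the candidate with the largest leximin key."""
    return max(candidates, key=lambda name: leximin_key(candidates[name]))


# Hypothetical per-agent average rewards for three candidate joint policies.
candidates = {
    "A": [3.0, 1.0, 5.0],  # minimum is 1.0
    "B": [2.0, 2.0, 2.0],  # minimum is 2.0
    "C": [2.0, 4.0, 2.5],  # minimum is 2.0, second-to-minimum is 2.5
}

# Min fairness alone cannot separate B and C (both have minimum 2.0).
min_values = {name: min(r) for name, r in candidates.items()}

# Leximin breaks the tie using the second-to-minimum reward.
best = leximin_best(candidates)
print(min_values, best)
```

Here B and C tie under min fairness, and leximin prefers C because its second-to-minimum reward (2.5) exceeds that of B (2.0). Enumerating candidates this way is only feasible for tiny policy sets, which is why an optimization-based method is needed in general.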
Similar resources
A Closed-Form Formula for the Fair Allocation of Gains in Cooperative N-Person Games
This paper provides a closed-form optimal solution to the multi-objective model of the fair allocation of gains obtained by cooperation among all players. The optimality of the proposed solution is first proved. Then, the properties of the proposed solution are investigated. At the end, a numerical example in an inventory control environment is given to demonstrate the application and t...
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
On Repeated Zero-Sum Games with Incomplete Information and Asymptotically Bounded Values
We consider repeated zero-sum games with incomplete information on the side of Player 2, with the total payoff given by the non-normalized sum of stage gains. In the classical examples, the value V_N of such an N-stage game is of the order of N or √N as N → ∞. Our aim is to present a general framework for another asymptotic behavior of the value V_N observed for the discrete version of the financial ...
Myopic versus clairvoyant admission policies in wireless networks
In this paper, we consider a dynamic scenario in which mobile users with elastic traffic arrive at a wireless heterogeneous system according to a Poisson arrival process. The wireless system consists of a set of overlapping network cells of different technologies. The protocols associated to each cell specify the throughput allocated to each user, given the cell load (i.e. the number of active ...
Quantitative Fair Simulation Games
Simulation is an attractive alternative for language inclusion for automata as it is an under-approximation of language inclusion, but usually has much lower complexity. For non-deterministic automata, while language inclusion is PSPACE-complete, simulation can be computed in polynomial time. Simulation has also been extended in two orthogonal directions, namely, (1) fair simulation, for simula...
Publication date: 2008